perm filename PUBS[PUB,MUS] blob sn#490510 filedate 1980-01-03 generic text, type C, neo UTF8
		INTERNAL PUBLICATIONS

Center for Computer Research in Music and Acoustics
Artificial Intelligence Laboratory
Stanford University
Stanford, California 94305

The Stanford computer  music  group  produces  technical  memoranda,  describing
results  of  the  research done at Stanford. We can offer these memoranda to the
public, but we request that we be reimbursed for publication costs by a donation
of  the  amount  listed  by  each  memo. This donation goes exclusively into the
publication funds for the project and helps us bring this work  to  the  public.
Make  checks payable to Stanford University. The donation is tax-deductible.

Some reprints of national publications are available from Stanford at the
suggested prices noted.


STAN-M-1	July, 1974				$5.65
"Computer Simulation of Music Instrument Tones in Reverberant
Environments"
	by John M. Chowning, John M. Grey, Loren Rush,
and James A. Moorer

This is a reprint of selected portions of the NSF proposal which resulted  in  a
grant  to  the  computer  music  group for research over a two-year period.  The
following is the abstract from the memo:

Novel and powerful computer simulation  techniques  have  been  developed  which
produce  realistic  music  instrument  tones  that  can  be dynamically moved to
arbitrary positions within a simulated reverberant space of  arbitrary  size  by
means  of  computer  control  of  four  loudspeakers.   Research support for the
simulation of complex auditory signals and environments will allow  the  further
development   and   application   of  computer  techniques  for  digital  signal
processing,  graphics,  and  computer  based  subjective  scaling,  toward   the
analysis,   data   reduction,  and  synthesis  of  music  instrument  tones  and
reverberant  spaces.   Main  areas   of   inquiry   are:   1)   those   physical
characteristics  of  a  tone which have perceptual significance, 2) the simplest
data  base  for  perceptual  representation  of  a  tone,  3)  the   effect   of
reverberation  and  location  on  the  perception  of  a  tone,  and  4) optimum
artificial reverberation techniques and position and number of loudspeakers  for
producing  a full illusion of azimuth, distance, and altitude.  These areas have
been scantily investigated, if at all, and they bear on a larger  more  profound
problem  of  intense  cross-disciplinary  interest: the cognitive processing and
organization of auditory stimuli.  The advanced state of computer technology now
makes  possible  the  realization  of a small computer system for the purpose of
real-time simulation.  The proposed research includes the specification of,  and
program development for, a small special purpose computing system for real-time,
interactive acoustical signal processing.  The research in simulation and system
development  has  significant  applications  in  a  variety  of  areas including
psychology, education, architectural acoustics, audio engineering, and music.

STAN-M-2	February, 1975				$7.10
"An Exploration of Musical Timbre"
	by John M. Grey

This  is  a  reprint  of  John  Grey's  doctoral  dissertation, submitted to the
department of Psychology, Stanford University.

Due to its overwhelming complexity, timbre perception  is  a  poorly  understood
subject  in the field of auditory perception. Computer-based research tools have
been developed that appear to  be  important  for  an  investigation  of  timbre
perception.  In the work to be described, an exploratory approach was formulated
for dealing with this highly multidimensional attribute of sound. This  approach
utilized  a computer technique for the synthesis of musical timbres based on the
analysis of natural instrument tones.  This technique was useful for  generating
stimuli  in  timbre  experiments  because  of its effectiveness in allowing the
investigator to specify  and  manipulate  the  physical  properties  of  complex
time-variant tones. An important discovery resulted suggesting that naturalistic
tones can be synthesized from a vastly simplified set  of  physical  properties.
These  simplified  tones  were  useful  as  stimuli in further studies on timbre
perception because of the great reduction in the number of physical  factors  to
be  considered  in  making  psychophysical  interpretations  of perceptual data.
Another study undertook to equalize a set of tones in the dimensions  of  pitch,
loudness  and  duration,  in  order to eliminate confounding factors from future
judgments on different timbres.  The simplified and matched tones were rated  by
pairwise  similarity  in  a  further  study,  and  the results were treated with
computer-based multidimensional scaling techniques to  obtain  an  interpretable
data  structure in a low dimensionality.  Three dimensions were found to explain
the similarity data.  Two were related to obvious  physical  properties  of  the
tones  (to the gross characteristics of the spectral energy distribution; and to
the  existence  of  precedent  low-amplitude,   high-frequency,   and   possibly
inharmonic energy in the initial segment of the attack). The third dimension was
interpretable either in terms of  a  physical  property  (synchronicity  in  the
attacks  of  higher harmonics) or as a higher-level distinction made between the
tones on the basis of their musical instrument family. Another  set  of  studies
next   initiated  an  exploration  of  timbre  in  terms  of  continuous  versus
categorical perception. An algorithm was designed to generate  a  set  of  tones
interpolating  between  two naturalistic timbres. Identification, discrimination
and perceptual  similarity  studies  were  performed  using  a  set  of  stimuli
generated  by  interpolations.  The  results  of  these  studies  suggested that
interpolations  were  perceived  to  be  continuous  rather  than   categorical.
Furthermore,  the timbral similarities between a partial set of the naturalistic
and  interpolated  tones  revealed  three  perceptual  dimensions  that  related
directly  to  those  found  above for the total set of naturalistic stimuli. The
first two physically-related dimensions were  found,  and  the  third  dimension
seemed  to  correspond  to  a higher-order distinction made between naturalistic
tones  and  the  interpolation-derived  tones,  this  superseding   the   family
distinction made for the total set of naturalistic tones.  A notion of timbre is
developed involving both a higher-level perceptual processing of tones that  has
access   to   stored   information  relating  to  the  distinctive  features  of
identifiable sources, and a lower-level, qualitative  perceptual  comparison  of
tones  with  respect to gross acoustical features lying outside of the domain of
specific identification. Suggestions for future research are made.

STAN-M-3	May, 1975			$8.60
"On the Segmentation and Analysis of Continuous Musical Sound by
Digital Computer"
	by James A. Moorer

This  is  a  reprint  of  James Moorer's doctoral dissertation, submitted to the
department of Computer Science, Stanford University.

The problem addressed by this dissertation  is  that  of  the  transcription  of
musical  sound by computer. A piece of polyphonic musical sound is digitized and
stored in  the  computer.  A  completely  automatic  procedure  then  takes  the
digitized  waveform  and  produces  a  written  manuscript  which  describes  in
classical musical notation what notes were played. We do not attempt to identify
the  instruments  involved.  The  program does not need to know what instruments
were playing.

It would appear that it is quite  difficult  to  achieve  human  performance  in
taking  musical  dictation. To simplify the task, certain restrictions have been
placed on the problem: (1) The pieces must have no  more  than  two  independent
voices.  (2)  Vibrato  and  glissando must not be present.  (3) Notes must be no
shorter than 80 milliseconds. (4) The fundamental frequency of a note  must  not
coincide  with  a  harmonic  of  a  simultaneously  sounding note of a different
frequency.  The first three conditions  are  not  inherent  limitations  in  the
procedures,  but were imposed simply for convenience. The last condition would seem
to require more study  to  determine  the  cues  that  human  listeners  use  to
distinguish,  for  example,  notes  at unison or octaves.  Numerous other lesser
restrictions were also imposed on the music to be analysed.

The method used for this analysis is a directed bank  of  sharp-cutoff  bandpass
filters.   First, a pitch detector is used to determine the harmony of the piece
at each point in time. Using the harmony information, the frequencies of a  bank
of  bandpass filters are determined so as to assure that every harmonic of every
instrument will pass through at least one of the filters.  The  output  of  each
filter is processed by a pitch detector and an energy detector. This gives power
and frequency information as  functions  of  time.   Each  power  and  frequency
function  pair  is  rated  as  to its quality. The rating takes into account the
constancy of the frequency function, the smoothness of the power  function,  and
several  other  measurements  on the functions. This rating is used to eliminate
spurious traces and null filter outputs.
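
The filter-placement step described above can be illustrated with a rough
Python sketch. This is not the dissertation's actual procedure: the function
name, the upper frequency limit, and the merging tolerance are all assumptions
made for this example. Given the fundamentals reported by the pitch detector,
it lists one bandpass center frequency per harmonic, so that every harmonic
falls on at least one filter.

```python
def harmonic_filter_centers(fundamentals_hz, max_hz=5000.0, merge_ratio=1.01):
    """Return sorted center frequencies covering every harmonic up to max_hz.

    Hypothetical helper: harmonics whose frequency ratio to the previously
    kept center is at most merge_ratio are assumed to share one filter.
    """
    # Every integer multiple of every fundamental, up to the frequency limit.
    harmonics = sorted(
        k * f0
        for f0 in fundamentals_hz
        for k in range(1, int(max_hz // f0) + 1)
    )
    centers = []
    for h in harmonics:
        if centers and h / centers[-1] <= merge_ratio:
            continue  # already covered by the previously kept filter
        centers.append(h)
    return centers


# Two simultaneous notes a fifth apart; the shared harmonic at 660 Hz
# needs only one filter.
print(harmonic_filter_centers([220.0, 330.0], max_hz=1000.0))
# -> [220.0, 330.0, 440.0, 660.0, 880.0, 990.0]
```
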

Notes are then inferred from groups of power and frequency function  pairs  that
occur  simultaneously with frequencies that are harmonically related. Notes with
higher overall ratings are preferred over other note  hypotheses.  The  melodies
are  then  grouped  by  separating the notes into the higher voice and the lower
voice.  Voice crossings are not tracked. For the final manuscripting,  Professor
Leland  Smith's  MSS  program  was  used. The analysis program directly produces
input for the manuscripting program, so the entire procedure is automated.
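
As a minimal sketch of how notes might be inferred from harmonically related
frequency traces, the Python fragment below groups simultaneous trace
frequencies under the lowest frequency of which each is a near-integer
multiple. The function names, the greedy strategy, and the tolerance are
assumptions for illustration only. The quality ratings described above are
ignored here, whereas the actual system prefers note hypotheses with higher
ratings.

```python
def is_harmonic_of(freq_hz, fundamental_hz, tolerance=0.03):
    """True if freq_hz is close to an integer multiple of fundamental_hz.

    The allowed deviation is tolerance * fundamental_hz (an assumed rule).
    """
    k = round(freq_hz / fundamental_hz)
    return k >= 1 and abs(freq_hz - k * fundamental_hz) <= tolerance * fundamental_hz


def group_traces(trace_freqs_hz, tolerance=0.03):
    """Greedily group simultaneous trace frequencies into note hypotheses.

    Returns a dict mapping each assumed fundamental to its member traces.
    """
    notes = {}
    for f in sorted(trace_freqs_hz):
        for f0 in notes:
            if is_harmonic_of(f, f0, tolerance):
                notes[f0].append(f)
                break
        else:
            # Not a harmonic of any existing hypothesis: start a new note.
            notes[f] = [f]
    return notes


print(group_traces([220.0, 440.0, 661.0, 330.0, 990.5]))
# -> {220.0: [220.0, 440.0, 661.0], 330.0: [330.0, 990.5]}
```

A greedy pass like this cannot tell a note from the harmonic of a lower
simultaneous note, which is one way to see why restriction (4) above matters.
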

In addition to the above described system, many other techniques  were  examined
for  their  utility  in this task. Each technique that was explored is described
and analysed, with a description of why it was not found useful for this task.

One interesting observation is that there is considerably  more  activity  in  a
piece of music than is perceived by the listener. This is especially common with
stringed instruments,  because  the  strings  that  are  not  being  manipulated
invariably  resonate  and  produce  sounds independently which are generally not
heard due to aural masking. This indicates  that  perhaps  we  should  use  more
perceptually-based  techniques to help determine what would actually be heard in
a piece of music, rather than determine exactly what is there, although detailed
descriptions of the contents of the piece may be useful for other purposes, such
as music education or musicology.

In general, the system works tolerably well on the restricted class  of  musical
sound.  Examples  are  shown  which  demonstrate the viability of the system for
different instruments and musical  styles.  Since  the  procedure  is  extremely
costly  in  terms  of  computer time, only a limited number of examples could be
processed. These examples are discussed with a description  of  how  the  system
could  be  improved  and  how  the  restrictions  might  be eliminated by better
processing techniques.



STAN-M-4	February, 1975			$1.80
"On the Loudness of Complex, Time-Variant Tones"
	by James A. Moorer

This memo is part of a proposal to the NSF division of Psychobiology.

This study of loudness is motivated by the discovery  that  a  set  of  complex,
time-variant  tones  appears to behave differently with respect to loudness than
would be predicted by the methods proposed in the  literature.  It  is  possible
that  the time-variant behavior of the sounds influences the loudness, so that a
more complete theory of loudness must take this behavior into account.  We  thus
propose  to  study these data and attempt to either verify the existing theories
of loudness or formulate a more comprehensive hypothesis of  loudness,  building
upon   the   currently  existing  theories,  and  to  test  this  hypothesis  by
synthesizing new  tones,  doing  equalization  experiments,  and  comparing  the
results with the predictions of the model of loudness perception.



STAN-M-5	December, 1975		  	$3.00
"The Synthesis of Complex Audio Spectra by Means of Discrete
Summation Formulae"
	by James A. Moorer

A new family of economical and versatile synthesis techniques has been
discovered which provides a means of controlling the spectra of audio
signals that has capabilities and control similar to those of Chowning's
frequency modulation technique.  The advantages of the current methods
over frequency modulation synthesis are that the signal can be exactly
limited to a specified number of partials, and that "one-sided" spectra
can be conveniently synthesized.
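
To illustrate what a discrete summation formula buys, the Python sketch below
checks a standard closed-form identity for a finite sum of sinusoids with
geometrically scaled amplitudes: N+1 partials are obtained from a handful of
sine and cosine evaluations, and the spectrum stops exactly at the last
partial. The symbol names (theta, beta, a, n) are assumptions for this example
and may not match the memo's notation.

```python
import math


def direct_sum(theta, beta, a, n):
    """Sum of a**k * sin(theta + k*beta) for k = 0..n, term by term."""
    return sum(a**k * math.sin(theta + k * beta) for k in range(n + 1))


def closed_form(theta, beta, a, n):
    """The same sum, evaluated in closed form.

    Undefined when a == 1 and beta is a multiple of 2*pi, where the
    denominator vanishes.
    """
    numerator = (
        math.sin(theta)
        - a * math.sin(theta - beta)
        - a ** (n + 1)
        * (math.sin(theta + (n + 1) * beta) - a * math.sin(theta + n * beta))
    )
    denominator = 1.0 + a * a - 2.0 * a * math.cos(beta)
    return numerator / denominator


# The two evaluations should agree to within rounding error.
difference = direct_sum(0.3, 0.7, 0.5, 5) - closed_form(0.3, 0.7, 0.5, 5)
print(abs(difference) < 1e-12)
```
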

NOTE: This document is no longer printed, because it is superseded by the
Audio Engineering Society Journal article:

James A.  Moorer, "The Synthesis of Complex Audio Spectra by Means of
Discrete Summation Formulae", Journal of the Audio Engineering Society,
Volume 24, #9, November 1976, pp 717-727

Reprints of this article can be ordered directly from the Audio
Engineering Society, 60 East 42nd Street, New York, N.Y. 10017



REPRINT		August, 1977		  	$3.00
"Signal Processing Aspects of Computer Music: A Survey"
Invited Paper  for the  Proceedings  of the  IEEE,  Volume 65,  Number  8,
pp1108-1137
	by James A. Moorer

The application  of modern  digital signal  processing techniques  to  the
production and processing of musical sound gives the composer and musician
a level of freedom and precision of control never before obtainable.  This
paper surveys the use of analysis of natural sounds for synthesis, the use
of speech and vocoder techniques, methods of artificial reverberation, the
use of discrete  summation formulae  for highly  efficient synthesis,  the
concept of the  all-digital recording  studio, and discusses  the role  of
special-purpose hardware in digital music synthesis, illustrated with  two
unique digital music synthesizers.



	BIBLIOGRAPHY OF NATIONAL PUBLICATIONS

NOTE: Reprints of some of these are available from Stanford at costs
listed below.

John M. Chowning, "The Simulation of Moving Sound Sources", Journal of the
Audio Engineering Society, Volume 19, #1, 1971

John M. Chowning, "The Synthesis of Complex Audio Spectra by Means of
Frequency Modulation", Journal of the Audio Engineering Society, Volume
21, #7, September 1973, pp526-534

James A. Moorer, "The Optimum Comb Method of Pitch Period Analysis of
Continuous Digitized Speech", IEEE Trans. on Acoustics, Speech, and Signal
Processing, Vol. ASSP-22, #5, October 1974, pp330-338

James A. Moorer, "On the Transcription of Musical Sound by Digital
Computer". Presented at the Second USA-JAPAN Computer Conference, August,
1975, reprinted in the Computer Music Journal, Volume 1, #4, November
1977, pp32-38

James A. Moorer, "The Synthesis of Complex Audio Spectra
by Means of Discrete Summation Formulae", Journal of the Audio Engineering
Society, Volume 24, #9, November 1976, pp 717-727 (this supersedes memo
STAN-M-5)

John M. Grey, "Multidimensional Perceptual Scaling of Musical Timbres",
Journal of the Acoustical Society of America, Volume 61, #5, May 1977,
pp1270-1277

John M. Grey, James A. Moorer, "A Perceptual Evaluation of Synthetic Music
Instrument Tones", Journal of the Acoustical Society of America,
Volume 62, pp454-462, August 1977

James A. Moorer, "Signal Processing Aspects of Computer Music: A Survey".
Invited Paper, Proceedings of the IEEE, Volume 65, #8, August, 1977,
pp1108-1137.  Reprinted in Computer Music Journal, Volume 1, #1, 1977.
(Reprint available from Stanford at $3.00)

James A. Moorer, "The Use of the Phase Vocoder in Computer Music
Applications".  Journal of the Audio Engineering Society, 1978

John M. Grey, John W. Gordon, "Perceptual Effects of Spectral
Modifications on Musical Timbres", Journal of the Acoustical Society of
America, Volume 63, #5, May 1978, pp1493-1500

John W. Gordon, John M. Grey, "Perception of Spectral Modifications on
Orchestral Instrument Tones," Computer Music Journal, Volume 2, #1, July
1978, pp24-31

James A. Moorer, "How Does a Computer Make Music?". Computer Music
Journal, Volume 2, Number 1, July 1978, pp32-37

John M. Grey, "Timbre Discrimination in Musical Patterns," Journal of the
Acoustical Society of America, Volume 64, #2, August 1978, pp467-472

James A. Moorer, "On the Coding of High-Quality Digitized Sound".
Presented at the 1979 European Conference of the Audio Engineering
Society, Brussels, Belgium, February 1979. Accepted for publication in the
Journal of the Audio Engineering Society

James A. Moorer, "The Use of Linear Prediction of Speech in Computer Music
Applications".  Journal of the Audio Engineering Society, Volume 27, #3,
March, 1979, pp134-140. Preprinted in French in "Journees Des Etudes",
Festival du Son, June, 1979.

James A. Moorer, "About this Reverberation Business", Computer Music
Journal, Volume 3, #2, June 1979, pp13-28

James A. Moorer, "The 4C Machine", with A. Chauveau, C. Abbott, P. Eastty,
and J. Lawson, Computer Music Journal, Volume 3, #3, September 1979,
pp16-24

James A. Moorer, "HELP!", Letter to the Editor, Computer Music Journal,
Volume 3, #3, September 1979, p4

	IN PREPARATION

John M. Grey, "Perceptual Continuity of Interpolations Between Musical
Timbres", for the Journal of the Acoustical Society of America

John M. Grey, "Multidimensional Scaling of Interpolated Music Instrument
Tones", for the Journal of the Acoustical Society of America